15 research outputs found
Towards Practical Control of Singular Values of Convolutional Layers
In general, convolutional neural networks (CNNs) are easy to train, but their
essential properties, such as generalization error and adversarial robustness,
are hard to control. Recent research demonstrated that singular values of
convolutional layers significantly affect such elusive properties and offered
several methods for controlling them. Nevertheless, these methods present an
intractable computational challenge or resort to coarse approximations. In this
paper, we offer a principled approach to alleviating constraints of the prior
art at the expense of an insignificant reduction in layer expressivity. Our
method is based on the tensor-train decomposition; it retains control over the
actual singular values of convolutional mappings while providing structurally
sparse and hardware-friendly representation. We demonstrate the improved
properties of modern CNNs with our method and analyze its impact on the model
performance, calibration, and adversarial robustness. The source code is
available at: https://github.com/WhiteTeaDragon/practical_svd_conv
Comment: Published as a conference paper at NeurIPS 202
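The spectral machinery behind such methods is easiest to see in one dimension: for a single-channel circular convolution, the singular values of the full linear operator are exactly the moduli of the kernel's DFT, so they can be read off at FFT cost instead of via a dense SVD. The sketch below is illustrative only (not the paper's code) and verifies this identity with NumPy:

```python
import numpy as np

def circulant(kernel, n):
    """Dense n x n matrix of circular convolution with `kernel`."""
    k = np.zeros(n)
    k[:len(kernel)] = kernel
    # Column j is the kernel cyclically shifted by j: C[i, j] = k[(i - j) mod n]
    return np.stack([np.roll(k, j) for j in range(n)], axis=1)

n = 8
kernel = np.array([1.0, -2.0, 0.5])
C = circulant(kernel, n)

# Singular values from a dense SVD of the full operator ...
sv_dense = np.sort(np.linalg.svd(C, compute_uv=False))

# ... coincide with the moduli of the kernel's DFT.
k_pad = np.zeros(n)
k_pad[:len(kernel)] = kernel
sv_fft = np.sort(np.abs(np.fft.fft(k_pad)))

assert np.allclose(sv_dense, sv_fft)
```

Controlling the spectrum of a multi-channel 2D layer builds on the same idea, applied per frequency.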
TT-NF: Tensor Train Neural Fields
Learning neural fields has been an active topic in deep learning research,
focusing, among other issues, on finding more compact and easy-to-fit
representations. In this paper, we introduce a novel low-rank representation
termed Tensor Train Neural Fields (TT-NF) for learning neural fields on dense
regular grids and efficient methods for sampling from them. Our representation
is a TT parameterization of the neural field, trained with backpropagation to
minimize a non-convex objective. We analyze the effect of low-rank compression
on the downstream task quality metrics in two settings. First, we demonstrate
the efficiency of our method in a sandbox task of tensor denoising, which
admits comparison with SVD-based schemes designed to minimize reconstruction
error. Furthermore, we apply the proposed approach to Neural Radiance Fields,
where the low-rank structure of the field corresponding to the best quality can
be discovered only through learning.
Comment: Preprint, under review
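For readers unfamiliar with the TT format, the classic TT-SVD procedure (representative of the SVD-based baselines such methods are compared against; a minimal NumPy sketch, not the authors' code) factors a dense tensor into a chain of three-way cores by sequential truncated SVDs:

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Factor a dense tensor into TT cores via sequential truncated SVDs."""
    shape = tensor.shape
    cores, r_prev, mat = [], 1, np.asarray(tensor)
    for n_k in shape[:-1]:
        mat = mat.reshape(r_prev * n_k, -1)
        U, S, Vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(S))
        cores.append(U[:, :r].reshape(r_prev, n_k, r))
        mat = S[:r, None] * Vt[:r]   # carry the remainder to the next core
        r_prev = r
    cores.append(mat.reshape(r_prev, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract TT cores back into a dense tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape([c.shape[1] for c in cores])

rng = np.random.default_rng(0)
# A rank-1 tensor is exactly representable with small TT-ranks.
a, b, c = rng.normal(size=4), rng.normal(size=5), rng.normal(size=6)
T = np.einsum('i,j,k->ijk', a, b, c)
cores = tt_svd(T, max_rank=2)
assert np.allclose(tt_reconstruct(cores), T)
```

TT-NF instead treats the cores as trainable parameters and fits them by backpropagation, which is what allows the best low-rank structure to be discovered through learning.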
Breathing New Life into 3D Assets with Generative Repainting
Diffusion-based text-to-image models ignited immense attention from the
vision community, artists, and content creators. Broad adoption of these models
is due to significant improvement in the quality of generations and efficient
conditioning on various modalities, not just text. However, lifting the rich
generative priors of these 2D models into 3D is challenging. Recent works have
proposed various pipelines powered by the entanglement of diffusion models and
neural fields. We explore the power of pretrained 2D diffusion models and
standard 3D neural radiance fields as independent, standalone tools and
demonstrate their ability to work together in a non-learned fashion. Such
modularity has the intrinsic advantage of easy partial upgrades, which has become
an important property in such a fast-paced domain. Our pipeline accepts any
legacy renderable geometry, such as textured or untextured meshes, orchestrates
the interaction between 2D generative refinement and 3D consistency enforcement
tools, and outputs a painted input geometry in several formats. We conduct a
large-scale study on a wide range of objects and categories from the
ShapeNetSem dataset and demonstrate the advantages of our approach, both
qualitatively and quantitatively. Project page:
https://www.obukhov.ai/repainting_3d_asset
Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation
We present an approach for encoding visual task relationships to improve
model performance in an Unsupervised Domain Adaptation (UDA) setting. Semantic
segmentation and monocular depth estimation are shown to be complementary
tasks; in a multi-task learning setting, a proper encoding of their
relationships can further improve performance on both tasks. Motivated by this
observation, we propose a novel Cross-Task Relation Layer (CTRL), which encodes
task dependencies between the semantic and depth predictions. To capture the
cross-task relationships, we propose a neural network architecture that
contains task-specific and cross-task refinement heads. Furthermore, we propose
an Iterative Self-Learning (ISL) training scheme, which exploits semantic
pseudo-labels to provide extra supervision on the target domain. We
experimentally observe improvements in both tasks' performance because the
complementary information present in these tasks is better captured.
Specifically, we show that: (1) our approach improves performance on all tasks
when they are complementary and mutually dependent; (2) the CTRL helps to
improve both semantic segmentation and depth estimation tasks performance in
the challenging UDA setting; (3) the proposed ISL training scheme further
improves the semantic segmentation performance. The implementation is available
at https://github.com/susaha/ctrl-uda.
Comment: Accepted at CVPR 2021; updated results according to the released
source code
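The pseudo-labelling step at the heart of an ISL-style scheme can be illustrated with a minimal sketch (threshold, names, and shapes are ours for illustration, not from the paper's implementation): target-domain pixels whose predicted class probability exceeds a confidence threshold receive a pseudo-label, and the rest are marked as ignored so they do not contribute to the loss.

```python
import numpy as np

IGNORE = 255  # conventional ignore index in semantic segmentation

def make_pseudo_labels(probs, threshold=0.9):
    """probs: (C, H, W) softmax output for one target-domain image."""
    conf = probs.max(axis=0)           # per-pixel confidence
    labels = probs.argmax(axis=0)      # per-pixel predicted class
    labels[conf < threshold] = IGNORE  # drop low-confidence pixels
    return labels

rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 4, 4))
probs = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
pl = make_pseudo_labels(probs, threshold=0.6)

# Confident pixels keep a class id in [0, C); the rest are IGNORE.
assert set(np.unique(pl)) <= {0, 1, 2, IGNORE}
```

The resulting labels then supervise the segmentation head on the target domain in subsequent training rounds.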
DiffDreamer: Consistent Single-view Perpetual View Generation with Conditional Diffusion Models
Perpetual view generation -- the task of generating long-range novel views by
flying into a given image -- is a novel and promising task. We introduce
DiffDreamer, an unsupervised framework capable of synthesizing novel views
depicting a long camera trajectory while training solely on internet-collected
images of nature scenes. We demonstrate that image-conditioned diffusion models
can effectively perform long-range scene extrapolation while preserving both
local and global consistency significantly better than prior GAN-based methods.
Project page: https://primecai.github.io/diffdreamer
Quantum Imaging with Incoherently Scattered Light from a Free-Electron Laser
The advent of accelerator-driven free-electron lasers (FEL) has opened new
avenues for high-resolution structure determination via diffraction methods
that go far beyond conventional x-ray crystallography methods. These techniques
rely on coherent scattering processes that require the maintenance of
first-order coherence of the radiation field throughout the imaging procedure.
Here we show that higher-order degrees of coherence, displayed in the intensity
correlations of incoherently scattered x-rays from an FEL, can be used to image
two-dimensional objects with a spatial resolution close to or even below the
Abbe limit. This constitutes a new approach towards structure determination
based on incoherent processes, including Compton scattering, fluorescence
emission or wavefront distortions, generally considered detrimental for imaging
applications. Our method extends the landmark intensity correlation
measurements of Hanbury Brown and Twiss to orders higher than second, paving the
way towards determining the structure and dynamics of matter in regimes where
coherent imaging methods have intrinsic limitations.
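A toy simulation conveys the underlying statistics (illustrative only, not the experiment's analysis code): thermal light has exponentially distributed shot-to-shot intensities, so the normalized equal-time correlations <I^n>/<I>^n approach n! -- the bunching signatures that second- and higher-order Hanbury Brown-Twiss measurements read out.

```python
import numpy as np

rng = np.random.default_rng(0)
# Shot-to-shot intensities of thermal (chaotic) light follow an
# exponential distribution; simulate many independent shots.
I = rng.exponential(scale=1.0, size=200_000)

# Normalized equal-time correlations: g(n) = <I^n> / <I>^n -> n!
g2 = np.mean(I**2) / np.mean(I)**2   # second order, ~2 (HBT bunching)
g3 = np.mean(I**3) / np.mean(I)**3   # third order, ~6

assert abs(g2 - 2.0) < 0.05
assert abs(g3 - 6.0) < 0.5
```

Imaging with such correlations amounts to measuring these moments across detector pairs (or triples and beyond) rather than at a single point.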
Tensor Decompositions in Deep Learning
Tensor decomposition is a subdomain of multilinear algebra concerned with dimensionality reduction and the analysis of multi-dimensional arrays (tensors). The field has numerous applications in physics, chemistry, the life sciences, and, more recently, machine learning, computer vision, and graphics. Despite the maturity of the field, much progress has happened in recent years thanks to affordable parallel compute driving empirical research.

Deep learning is a young subdomain of machine learning concerned with fitting deep, non-linear parametric models in a non-convex optimization setting with abundant data. The tipping point of interest in deep learning came when a neural network (AlexNet) set a record-high score on a popular image classification benchmark (ImageNet), promising to solve long-standing computer vision problems. Over the past years, most breakthroughs in deep learning have come from finding smarter ways to increase model size and complexity. However, the need to deploy deep models on edge devices, such as for computational photography on mobile phones, has set a new direction towards finding lean models. On the other hand, many high-potential deep learning techniques, such as Neural Radiance Fields (NeRF) or vision transformers, leave a large margin for improvement upon inception.

In this thesis, we investigate the use of tensor decompositions in the context of modern deep learning techniques. We aim to improve several types of efficiency: memory footprint and runtime performance, measured in parameters and floating-point operations (FLOPs), respectively. We begin by exploring neural network layer compression schemes and propose a tensorized representation with a basis tensor shared among layers and per-layer coefficients. Subsequently, we study the manifold of Tensor Trains (TT) of fixed rank in the context of parameterizing layers of Generative Adversarial Networks (GANs) and demonstrate the ability to compress networks while maintaining training stability.
Finally, we utilize the TT parameterization to learn compressed NeRFs and devise sampling schemes with support for automatic differentiation to facilitate training. Unlike most previous works on tensor decompositions, we treat decompositions as models in the deep learning sense and update their parameters through backpropagation and optimization. As in prior art, tensorized formats admit certain algebraic operations, making them an appealing entity at the intersection of two prominent research directions.
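As a back-of-the-envelope illustration of the memory savings from TT parameterization (the layer size, mode factorization, and rank below are chosen by us for illustration, not taken from the thesis): a dense m x n weight matrix stores m*n parameters, while its TT-matrix format stores one small core per factored mode pair.

```python
def tt_params(in_modes, out_modes, rank):
    """Parameter count of a TT matrix with uniform internal rank."""
    d = len(in_modes)
    total = 0
    for k in range(d):
        # Boundary cores have rank 1 on their outer side.
        r_left = 1 if k == 0 else rank
        r_right = 1 if k == d - 1 else rank
        total += r_left * in_modes[k] * out_modes[k] * r_right
    return total

in_modes = out_modes = (4, 4, 4, 4, 4)   # 4^5 = 1024 on each side
dense = 1024 * 1024
tt = tt_params(in_modes, out_modes, rank=8)
print(dense, tt)  # → 1048576 3328

assert tt < dense // 100
```

The gap widens further as the dense dimensions grow, which is what makes the format attractive for lean, edge-deployable models.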